A 3-Component System for Text-to-Speech Synthesis

نویسندگان

  • Chris Lin
  • Qian Mu
  • Yi Shao
چکیده

Text-to-speech (TTS) synthesis is the text normalization from written input form to spoken output form, with applications in intelligent personal assistants (IPAs), voice-activated emails, and message readers for the visually impaired. In this project, we built a 3-component TTS text normalization system with token-to-token Naïve Bayes (NB) classifiers, a token-to-class Support Vector Machine (SVM) classifier, and hand-built grammar rules based on regular expression. The performance of this system reached a development accuracy of 99.36% and a testing accuracy of 98.88%. These results are comparable to those obtained by deep learning approaches. Keywords—Machine Learning; Natural Language Processing; Text Normalization; Support Vector Machine; Naïve Bayes

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques

One of the interesting topics on multimedia domain is concerned with empowering computer in order to speech production. Speech synthesis is granting human abilities to the computer for speech production. Data-based approach and process-based approach are the two main approaches on speech synthesis. Each approach has its varied challenges. Unit-selection speech synthesis and statistical parametr...

متن کامل

مراحل و نحوه ی تهیه ی دادگان های صوتی هجایی و دایفونی برای سامانه ی تبدیل متن به گفتار فارسی

Abstract Speech databases are part of the concatenative text to speech synthesis systems. Phonetic quality of the databases plays a significant role in the naturalness of the synthesized speech. This paper introduces two syllable and diphone speech databases for Persian and investigates the way of their development and their specifications and their advantages to each other. ...

متن کامل

Cipher text only attack on speech time scrambling systems using correction of audio spectrogram

Recently permutation multimedia ciphers were broken in a chosen-plaintext scenario. That attack models a very resourceful adversary which may not always be the case. To show insecurity of these ciphers, we present a cipher-text only attack on speech permutation ciphers. We show inherent redundancies of speech can pave the path for a successful cipher-text only attack. To that end, regularities ...

متن کامل

Experiments with Unit Selection Speech Databases for Indian Languages

This paper presents a brief overview of unit selection speech synthesis and discuss the issues relevant to the development of voices for Indian languages. We discuss a few perceptual experiments conducted on Hindi and Telugu voices. 1 Role of Language Technologies Most of the Information in digital world is accessible to a few who can read or understand a particular language. Language technolog...

متن کامل

Design of English to Hindi Corpus Based Text Conversion and Hindi Text to Speech Synthesis

English is a global language but is understood by few percentage of population in India. It continues to remain a barrier for rural population to learn and compete at a global level. Machine translation helps people from different places to understand an unknown language without the aid of human translator. A Text to Speech system generatesspeech from text given as input. The proposed system wi...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017